Dataset statistics
| Statistic | Value |
|---|---|
| Number of variables | 24 |
| Number of observations | 10000 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 6.0 MiB |
| Average record size in memory | 632.3 B |
Variable types
| Type | Count |
|---|---|
| NUM | 13 |
| CAT | 9 |
| BOOL | 2 |
Reproduction
| Item | Value |
|---|---|
| Analysis started | 2020-11-05 19:39:19.751370 |
| Analysis finished | 2020-11-05 19:40:10.945794 |
| Version | pandas-profiling v2.6.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| zip_code has a high cardinality: 720 distinct values | High cardinality |
| emp_length has 250 (2.5%) zeros | Zeros |
| delinq_2yrs has 8915 (89.1%) zeros | Zeros |
| inq_last_6mths has 4607 (46.1%) zeros | Zeros |
| mths_since_last_delinq has 6479 (64.8%) zeros | Zeros |
| mths_since_last_record has 267 (2.7%) zeros | Zeros |
| revol_bal has 278 (2.8%) zeros | Zeros |
| revol_util has 254 (2.5%) zeros | Zeros |
is_bad
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 8705 | 87.1% | |
| 1 | 1295 | 13.0% |
emp_length
Real number (ℝ≥0)
| Distinct count | 11 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.8554 |
|---|---|
| Minimum | 0 |
| Maximum | 10 |
| Zeros | 250 |
| Zeros (%) | 2.5% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 4 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.461659883 |
|---|---|
| Coefficient of variation (CV) | 0.7129505053 |
| Kurtosis | -1.362680399 |
| Mean | 4.8554 |
| Mean Absolute Deviation (MAD) | 3.0513982 |
| Skewness | 0.366779797 |
| Sum | 48554 |
| Variance | 11.98308915 |
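Several of the descriptive statistics above follow from one another by simple arithmetic. The sketch below recomputes them from the other figures reported for this variable. The variable names are illustrative, and the sketch assumes the coefficient of variation is the standard deviation divided by the mean, which is consistent with the reported values.

```python
# Figures copied from the tables above for this variable.
n = 10000                 # number of observations
mean = 4.8554
std = 3.461659883
minimum, q1, q3, maximum = 0, 2, 8, 10

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
total = mean * n                  # sum of all values
value_range = maximum - minimum   # range
iqr = q3 - q1                     # interquartile range

print(cv, variance, total, value_range, iqr)
```

Up to rounding of the inputs, the results agree with the reported CV, variance, sum, range, and IQR.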
| Value | Count | Frequency (%) | |
| 10 | 2168 | 21.7% | |
| 1 | 2083 | 20.8% | |
| 2 | 1183 | 11.8% | |
| 3 | 1010 | 10.1% | |
| 4 | 889 | 8.9% | |
| 5 | 779 | 7.8% | |
| 6 | 535 | 5.3% | |
| 7 | 421 | 4.2% | |
| 8 | 351 | 3.5% | |
| 9 | 331 | 3.3% |
| Value | Count | Frequency (%) | |
| 0 | 250 | 2.5% | |
| 1 | 2083 | 20.8% | |
| 2 | 1183 | 11.8% | |
| 3 | 1010 | 10.1% | |
| 4 | 889 | 8.9% |
| Value | Count | Frequency (%) | |
| 10 | 2168 | 21.7% | |
| 9 | 331 | 3.3% | |
| 8 | 351 | 3.5% | |
| 7 | 421 | 4.2% | |
| 6 | 535 | 5.3% |
home_ownership
Categorical
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
| Value | Count | Frequency (%) | |
| rent | 4745 | 47.4% | |
| mortgage | 4445 | 44.5% | |
| own | 775 | 7.8% | |
| other | 34 | 0.3% | |
| none | 1 | < 0.1% |
Length
| Max length | 8 |
|---|---|
| Mean length | 5.7039 |
| Min length | 3 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 10 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 10 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 10 | 100.0% |
annual_inc
Real number (ℝ≥0)
| Distinct count | 1901 |
|---|---|
| Unique (%) | 19.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 68201.99124 |
|---|---|
| Minimum | 2000 |
| Maximum | 900000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 2000 |
|---|---|
| 5-th percentile | 23747 |
| Q1 | 40000 |
| median | 58000 |
| Q3 | 82000 |
| 95-th percentile | 143525 |
| Maximum | 900000 |
| Range | 898000 |
| Interquartile range (IQR) | 42000 |
Descriptive statistics
| Standard deviation | 48587.93007 |
|---|---|
| Coefficient of variation (CV) | 0.7124121919 |
| Kurtosis | 51.15844552 |
| Mean | 68201.99124 |
| Mean Absolute Deviation (MAD) | 30244.93419 |
| Skewness | 4.880579183 |
| Sum | 682019912.4 |
| Variance | 2360786948 |
| Value | Count | Frequency (%) | |
| 60000 | 381 | 3.8% | |
| 50000 | 267 | 2.7% | |
| 40000 | 222 | 2.2% | |
| 75000 | 213 | 2.1% | |
| 30000 | 211 | 2.1% | |
| 65000 | 204 | 2.0% | |
| 48000 | 196 | 2.0% | |
| 70000 | 193 | 1.9% | |
| 45000 | 181 | 1.8% | |
| 80000 | 170 | 1.7% | |
| Other values (1891) | 7762 | 77.6% |
| Value | Count | Frequency (%) | |
| 2000 | 1 | < 0.1% | |
| 4080 | 1 | < 0.1% | |
| 4200 | 2 | < 0.1% | |
| 4800 | 2 | < 0.1% | |
| 5000 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 900000 | 2 | < 0.1% | |
| 860000 | 1 | < 0.1% | |
| 780000 | 1 | < 0.1% | |
| 744000 | 1 | < 0.1% | |
| 725000 | 1 | < 0.1% |
verification_status
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
| Value | Count | Frequency (%) | |
| not verified | 4367 | 43.7% | |
| verified - income | 3214 | 32.1% | |
| verified - income source | 2419 | 24.2% |
Length
| Max length | 24 |
|---|---|
| Mean length | 16.5098 |
| Min length | 12 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 13 | 86.7% | |
| Space_Separator | 1 | 6.7% | |
| Dash_Punctuation | 1 | 6.7% |
| Value | Count | Frequency (%) | |
| Latin | 13 | 86.7% | |
| Common | 2 | 13.3% |
| Value | Count | Frequency (%) | |
| ASCII | 15 | 100.0% |
pymnt_plan
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
| Value | Count | Frequency (%) | |
| n | 9998 | > 99.9% | |
| y | 2 | < 0.1% |
purpose_cat
Categorical
| Distinct count | 27 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
| Value | Count | Frequency (%) | |
| debt consolidation | 4454 | 44.5% | |
| credit card | 1273 | 12.7% | |
| other | 1026 | 10.3% | |
| home improvement | 800 | 8.0% | |
| major purchase | 546 | 5.5% | |
| small business | 461 | 4.6% | |
| car | 349 | 3.5% | |
| wedding | 250 | 2.5% | |
| medical | 183 | 1.8% | |
| moving | 159 | 1.6% | |
| Other values (17) | 499 | 5.0% |
Length
| Max length | 33 |
|---|---|
| Mean length | 13.9381 |
| Min length | 3 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 21 | 95.5% | |
| Space_Separator | 1 | 4.5% |
| Value | Count | Frequency (%) | |
| Latin | 21 | 95.5% | |
| Common | 1 | 4.5% |
| Value | Count | Frequency (%) | |
| ASCII | 22 | 100.0% |
zip_code
Categorical
| Distinct count | 720 |
|---|---|
| Unique (%) | 7.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
| Value | Count | Frequency (%) | |
| 100xx | 158 | 1.6% | |
| 112xx | 141 | 1.4% | |
| 945xx | 129 | 1.3% | |
| 070xx | 125 | 1.2% | |
| 606xx | 114 | 1.1% | |
| 900xx | 107 | 1.1% | |
| 021xx | 99 | 1.0% | |
| 941xx | 95 | 0.9% | |
| 926xx | 94 | 0.9% | |
| 300xx | 93 | 0.9% | |
| Other values (710) | 8845 | 88.4% |
Length
| Max length | 5 |
|---|---|
| Mean length | 5 |
| Min length | 5 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 10 | 90.9% | |
| Lowercase_Letter | 1 | 9.1% |
| Value | Count | Frequency (%) | |
| Common | 10 | 90.9% | |
| Latin | 1 | 9.1% |
| Value | Count | Frequency (%) | |
| ASCII | 11 | 100.0% |
addr_state
Categorical
| Distinct count | 50 |
|---|---|
| Unique (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
| Value | Count | Frequency (%) | |
| ca | 1748 | 17.5% | |
| ny | 958 | 9.6% | |
| fl | 714 | 7.1% | |
| tx | 700 | 7.0% | |
| nj | 482 | 4.8% | |
| va | 392 | 3.9% | |
| il | 386 | 3.9% | |
| pa | 378 | 3.8% | |
| ga | 357 | 3.6% | |
| ma | 331 | 3.3% | |
| Other values (40) | 3554 | 35.5% |
Length
| Max length | 2 |
|---|---|
| Mean length | 2 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 24 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 24 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 24 | 100.0% |
debt_to_income
Real number (ℝ≥0)
| Distinct count | 2585 |
|---|---|
| Unique (%) | 25.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.338704 |
|---|---|
| Minimum | 0 |
| Maximum | 29.99 |
| Zeros | 58 |
| Zeros (%) | 0.6% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2.129 |
| Q1 | 8.16 |
| median | 13.41 |
| Q3 | 18.6925 |
| 95-th percentile | 23.93 |
| Maximum | 29.99 |
| Range | 29.99 |
| Interquartile range (IQR) | 10.5325 |
Descriptive statistics
| Standard deviation | 6.754211507 |
|---|---|
| Coefficient of variation (CV) | 0.5063619004 |
| Kurtosis | -0.8546793248 |
| Mean | 13.338704 |
| Mean Absolute Deviation (MAD) | 5.669516109 |
| Skewness | -0.008777611376 |
| Sum | 133387.04 |
| Variance | 45.61937308 |
| Value | Count | Frequency (%) | |
| 0 | 58 | 0.6% | |
| 12.48 | 16 | 0.2% | |
| 13.51 | 13 | 0.1% | |
| 10 | 13 | 0.1% | |
| 19.2 | 13 | 0.1% | |
| 18.14 | 13 | 0.1% | |
| 4.8 | 12 | 0.1% | |
| 17.82 | 12 | 0.1% | |
| 15.38 | 12 | 0.1% | |
| 22.43 | 12 | 0.1% | |
| Other values (2575) | 9826 | 98.3% |
| Value | Count | Frequency (%) | |
| 0 | 58 | 0.6% | |
| 0.11 | 1 | < 0.1% | |
| 0.12 | 1 | < 0.1% | |
| 0.13 | 1 | < 0.1% | |
| 0.14 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 29.99 | 1 | < 0.1% | |
| 29.93 | 1 | < 0.1% | |
| 29.92 | 1 | < 0.1% | |
| 29.83 | 1 | < 0.1% | |
| 29.74 | 1 | < 0.1% |
delinq_2yrs
Real number (ℝ≥0)
| Distinct count | 10 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1481 |
|---|---|
| Minimum | 0 |
| Maximum | 11 |
| Zeros | 8915 |
| Zeros (%) | 89.1% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 11 |
| Range | 11 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.5061541358 |
|---|---|
| Coefficient of variation (CV) | 3.417651153 |
| Kurtosis | 54.83739854 |
| Mean | 0.1481 |
| Mean Absolute Deviation (MAD) | 0.2640623 |
| Skewness | 5.640791298 |
| Sum | 1481 |
| Variance | 0.2561920092 |
| Value | Count | Frequency (%) | |
| 0 | 8915 | 89.1% | |
| 1 | 822 | 8.2% | |
| 2 | 186 | 1.9% | |
| 3 | 50 | 0.5% | |
| 4 | 14 | 0.1% | |
| 5 | 6 | 0.1% | |
| 6 | 3 | < 0.1% | |
| 7 | 2 | < 0.1% | |
| 11 | 1 | < 0.1% | |
| 8 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 8915 | 89.1% | |
| 1 | 822 | 8.2% | |
| 2 | 186 | 1.9% | |
| 3 | 50 | 0.5% | |
| 4 | 14 | 0.1% |
| Value | Count | Frequency (%) | |
| 11 | 1 | < 0.1% | |
| 8 | 1 | < 0.1% | |
| 7 | 2 | < 0.1% | |
| 6 | 3 | < 0.1% | |
| 5 | 6 | 0.1% |
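The MAD figure reported for delinq_2yrs (0.2640623) can be checked against the value counts listed above. The following is a minimal sketch using only the Python standard library. It rebuilds the column from the counts and computes both the mean absolute deviation about the mean and the median absolute deviation about the median. The variable names are illustrative and not part of the report.

```python
from statistics import median

# Value counts for delinq_2yrs, copied from the frequency table above.
counts = {0: 8915, 1: 822, 2: 186, 3: 50, 4: 14, 5: 6, 6: 3, 7: 2, 8: 1, 11: 1}

# Rebuild the full column (10000 values) from the counts.
values = [v for v, c in counts.items() for _ in range(c)]
n = len(values)

mean = sum(values) / n
# Sample variance (divides by n - 1).
sample_variance = sum((v - mean) ** 2 for v in values) / (n - 1)
# Mean absolute deviation about the mean.
mean_abs_dev = sum(abs(v - mean) for v in values) / n
# Median absolute deviation about the median.
med = median(values)
median_abs_dev = median(abs(v - med) for v in values)

print(mean, sample_variance, mean_abs_dev, median_abs_dev)
```

Under these assumptions the mean absolute deviation reproduces the reported 0.2640623, while the median absolute deviation is 0 because more than half of the values are zero. The reported variance matches the sample variance (n - 1 denominator).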
inq_last_6mths
Real number (ℝ≥0)
| Distinct count | 20 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.0664 |
|---|---|
| Minimum | 0 |
| Maximum | 25 |
| Zeros | 4607 |
| Zeros (%) | 46.1% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 25 |
| Range | 25 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.475875625 |
|---|---|
| Coefficient of variation (CV) | 1.383979393 |
| Kurtosis | 23.68251185 |
| Mean | 1.0664 |
| Mean Absolute Deviation (MAD) | 1.01822448 |
| Skewness | 3.116512638 |
| Sum | 10664 |
| Variance | 2.178208861 |
| Value | Count | Frequency (%) | |
| 0 | 4607 | 46.1% | |
| 1 | 2684 | 26.8% | |
| 2 | 1431 | 14.3% | |
| 3 | 731 | 7.3% | |
| 4 | 227 | 2.3% | |
| 5 | 152 | 1.5% | |
| 6 | 76 | 0.8% | |
| 7 | 42 | 0.4% | |
| 8 | 27 | 0.3% | |
| 9 | 10 | 0.1% | |
| Other values (10) | 13 | 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 4607 | 46.1% | |
| 1 | 2684 | 26.8% | |
| 2 | 1431 | 14.3% | |
| 3 | 731 | 7.3% | |
| 4 | 227 | 2.3% |
| Value | Count | Frequency (%) | |
| 25 | 1 | < 0.1% | |
| 24 | 1 | < 0.1% | |
| 18 | 2 | < 0.1% | |
| 17 | 1 | < 0.1% | |
| 16 | 1 | < 0.1% |
mths_since_last_delinq
Real number (ℝ≥0)
| Distinct count | 91 |
|---|---|
| Unique (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.222 |
|---|---|
| Minimum | 0 |
| Maximum | 120 |
| Zeros | 6479 |
| Zeros (%) | 64.8% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 23 |
| 95-th percentile | 65 |
| Maximum | 120 |
| Range | 120 |
| Interquartile range (IQR) | 23 |
Descriptive statistics
| Standard deviation | 21.99844788 |
|---|---|
| Coefficient of variation (CV) | 1.663776122 |
| Kurtosis | 1.249060661 |
| Mean | 13.222 |
| Mean Absolute Deviation (MAD) | 17.6746228 |
| Skewness | 1.556097754 |
| Sum | 132220 |
| Variance | 483.9317092 |
| Value | Count | Frequency (%) | |
| 0 | 6479 | 64.8% | |
| 30 | 69 | 0.7% | |
| 34 | 66 | 0.7% | |
| 23 | 65 | 0.7% | |
| 38 | 65 | 0.7% | |
| 24 | 64 | 0.6% | |
| 44 | 64 | 0.6% | |
| 20 | 63 | 0.6% | |
| 33 | 63 | 0.6% | |
| 18 | 61 | 0.6% | |
| Other values (81) | 2941 | 29.4% |
| Value | Count | Frequency (%) | |
| 0 | 6479 | 64.8% | |
| 1 | 6 | 0.1% | |
| 2 | 29 | 0.3% | |
| 3 | 40 | 0.4% | |
| 4 | 37 | 0.4% |
| Value | Count | Frequency (%) | |
| 120 | 1 | < 0.1% | |
| 115 | 1 | < 0.1% | |
| 97 | 1 | < 0.1% | |
| 96 | 1 | < 0.1% | |
| 95 | 1 | < 0.1% |
mths_since_last_record
Real number (ℝ≥0)
| Distinct count | 94 |
|---|---|
| Unique (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 114.1828 |
|---|---|
| Minimum | 0 |
| Maximum | 119 |
| Zeros | 267 |
| Zeros (%) | 2.7% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 91 |
| Q1 | 119 |
| median | 119 |
| Q3 | 119 |
| 95-th percentile | 119 |
| Maximum | 119 |
| Range | 119 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 20.78681778 |
|---|---|
| Coefficient of variation (CV) | 0.1820485903 |
| Kurtosis | 22.76765479 |
| Mean | 114.1828 |
| Mean Absolute Deviation (MAD) | 8.85020928 |
| Skewness | -4.838179104 |
| Sum | 1141828 |
| Variance | 432.0917933 |
| Value | Count | Frequency (%) | |
| 119 | 9163 | 91.6% | |
| 0 | 267 | 2.7% | |
| 89 | 21 | 0.2% | |
| 116 | 18 | 0.2% | |
| 87 | 17 | 0.2% | |
| 92 | 17 | 0.2% | |
| 86 | 17 | 0.2% | |
| 104 | 16 | 0.2% | |
| 100 | 16 | 0.2% | |
| 114 | 16 | 0.2% | |
| Other values (84) | 432 | 4.3% |
| Value | Count | Frequency (%) | |
| 0 | 267 | 2.7% | |
| 6 | 1 | < 0.1% | |
| 11 | 1 | < 0.1% | |
| 17 | 1 | < 0.1% | |
| 20 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 119 | 9163 | 91.6% | |
| 118 | 11 | 0.1% | |
| 117 | 10 | 0.1% | |
| 116 | 18 | 0.2% | |
| 115 | 10 | 0.1% |
open_acc
Real number (ℝ≥0)
| Distinct count | 36 |
|---|---|
| Unique (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.3344 |
|---|---|
| Minimum | 1 |
| Maximum | 39 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 6 |
| median | 9 |
| Q3 | 12 |
| 95-th percentile | 18 |
| Maximum | 39 |
| Range | 38 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 4.525464023 |
|---|---|
| Coefficient of variation (CV) | 0.4848157378 |
| Kurtosis | 1.841019064 |
| Mean | 9.3344 |
| Mean Absolute Deviation (MAD) | 3.51517792 |
| Skewness | 1.063972019 |
| Sum | 93344 |
| Variance | 20.47982462 |
| Value | Count | Frequency (%) | |
| 7 | 1035 | 10.3% | |
| 6 | 990 | 9.9% | |
| 8 | 937 | 9.4% | |
| 9 | 934 | 9.3% | |
| 10 | 805 | 8.1% | |
| 5 | 763 | 7.6% | |
| 11 | 692 | 6.9% | |
| 4 | 631 | 6.3% | |
| 12 | 577 | 5.8% | |
| 13 | 487 | 4.9% | |
| Other values (26) | 2149 | 21.5% |
| Value | Count | Frequency (%) | |
| 1 | 7 | 0.1% | |
| 2 | 163 | 1.6% | |
| 3 | 374 | 3.7% | |
| 4 | 631 | 6.3% | |
| 5 | 763 | 7.6% |
| Value | Count | Frequency (%) | |
| 39 | 1 | < 0.1% | |
| 36 | 2 | < 0.1% | |
| 35 | 1 | < 0.1% | |
| 33 | 3 | < 0.1% | |
| 32 | 1 | < 0.1% |
pub_rec
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 9427 | 94.3% | |
| 1 | 550 | 5.5% | |
| 2 | 18 | 0.2% | |
| 3 | 5 | 0.1% |
Length
| Max length | 1 |
|---|---|
| Mean length | 1 |
| Min length | 1 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 4 | 100.0% |
| Value | Count | Frequency (%) | |
| Common | 4 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 4 | 100.0% |
revol_bal
Real number (ℝ≥0)
| Distinct count | 8130 |
|---|---|
| Unique (%) | 81.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14271.0074 |
|---|---|
| Minimum | 0 |
| Maximum | 1207359 |
| Zeros | 278 |
| Zeros (%) | 2.8% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 277.95 |
| Q1 | 3524.5 |
| median | 8645.5 |
| Q3 | 16952.25 |
| 95-th percentile | 44554.85 |
| Maximum | 1207359 |
| Range | 1207359 |
| Interquartile range (IQR) | 13427.75 |
Descriptive statistics
| Standard deviation | 25437.9082 |
|---|---|
| Coefficient of variation (CV) | 1.782488614 |
| Kurtosis | 570.4140985 |
| Mean | 14271.0074 |
| Mean Absolute Deviation (MAD) | 11728.71446 |
| Skewness | 16.32424653 |
| Sum | 142710074 |
| Variance | 647087173.7 |
| Value | Count | Frequency (%) | |
| 0 | 278 | 2.8% | |
| 2227 | 6 | 0.1% | |
| 1763 | 6 | 0.1% | |
| 11628 | 5 | 0.1% | |
| 4801 | 5 | 0.1% | |
| 760 | 5 | 0.1% | |
| 5272 | 4 | < 0.1% | |
| 18550 | 4 | < 0.1% | |
| 15 | 4 | < 0.1% | |
| 5220 | 4 | < 0.1% | |
| Other values (8120) | 9679 | 96.8% |
| Value | Count | Frequency (%) | |
| 0 | 278 | 2.8% | |
| 1 | 2 | < 0.1% | |
| 3 | 2 | < 0.1% | |
| 5 | 1 | < 0.1% | |
| 6 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1207359 | 1 | < 0.1% | |
| 602519 | 1 | < 0.1% | |
| 508961 | 1 | < 0.1% | |
| 487589 | 1 | < 0.1% | |
| 423189 | 1 | < 0.1% |
revol_util
Real number (ℝ≥0)
| Distinct count | 1027 |
|---|---|
| Unique (%) | 10.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 48.451419 |
|---|---|
| Minimum | 0 |
| Maximum | 100.6 |
| Zeros | 254 |
| Zeros (%) | 2.5% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2.8 |
| Q1 | 25 |
| median | 48.7 |
| Q3 | 71.8 |
| 95-th percentile | 93.6 |
| Maximum | 100.6 |
| Range | 100.6 |
| Interquartile range (IQR) | 46.8 |
Descriptive statistics
| Standard deviation | 28.18384582 |
|---|---|
| Coefficient of variation (CV) | 0.5816928874 |
| Kurtosis | -1.094338897 |
| Mean | 48.451419 |
| Mean Absolute Deviation (MAD) | 24.0658605 |
| Skewness | -0.01681450313 |
| Sum | 484514.19 |
| Variance | 794.3291651 |
| Value | Count | Frequency (%) | |
| 0 | 254 | 2.5% | |
| 48.7 | 34 | 0.3% | |
| 46.6 | 21 | 0.2% | |
| 43.4 | 20 | 0.2% | |
| 0.1 | 20 | 0.2% | |
| 55.4 | 19 | 0.2% | |
| 70 | 19 | 0.2% | |
| 47.6 | 19 | 0.2% | |
| 53.6 | 19 | 0.2% | |
| 56.8 | 19 | 0.2% | |
| Other values (1017) | 9556 | 95.6% |
| Value | Count | Frequency (%) | |
| 0 | 254 | 2.5% | |
| 0.03 | 1 | < 0.1% | |
| 0.1 | 20 | 0.2% | |
| 0.12 | 1 | < 0.1% | |
| 0.2 | 11 | 0.1% |
| Value | Count | Frequency (%) | |
| 100.6 | 1 | < 0.1% | |
| 100 | 1 | < 0.1% | |
| 99.9 | 4 | < 0.1% | |
| 99.8 | 5 | 0.1% | |
| 99.7 | 3 | < 0.1% |
total_acc
Real number (ℝ≥0)
| Distinct count | 75 |
|---|---|
| Unique (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22.0103 |
|---|---|
| Minimum | 1 |
| Maximum | 90 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 6 |
| Q1 | 13 |
| median | 20 |
| Q3 | 29 |
| 95-th percentile | 44 |
| Maximum | 90 |
| Range | 89 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 11.70655792 |
|---|---|
| Coefficient of variation (CV) | 0.5318672584 |
| Kurtosis | 0.9259505914 |
| Mean | 22.0103 |
| Mean Absolute Deviation (MAD) | 9.28843058 |
| Skewness | 0.8712513322 |
| Sum | 220103 |
| Variance | 137.0434983 |
| Value | Count | Frequency (%) | |
| 15 | 369 | 3.7% | |
| 20 | 365 | 3.6% | |
| 17 | 360 | 3.6% | |
| 12 | 357 | 3.6% | |
| 14 | 351 | 3.5% | |
| 19 | 346 | 3.5% | |
| 16 | 340 | 3.4% | |
| 18 | 339 | 3.4% | |
| 13 | 331 | 3.3% | |
| 22 | 329 | 3.3% | |
| Other values (65) | 6513 | 65.1% |
| Value | Count | Frequency (%) | |
| 1 | 3 | < 0.1% | |
| 2 | 10 | 0.1% | |
| 3 | 58 | 0.6% | |
| 4 | 115 | 1.1% | |
| 5 | 144 | 1.4% |
| Value | Count | Frequency (%) | |
| 90 | 1 | < 0.1% | |
| 81 | 1 | < 0.1% | |
| 80 | 1 | < 0.1% | |
| 79 | 1 | < 0.1% | |
| 78 | 1 | < 0.1% |
initial_list_status
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
| Value | Count | Frequency (%) | |
| f | 9983 | 99.8% | |
| m | 17 | 0.2% |
Length
| Max length | 1 |
|---|---|
| Mean length | 1 |
| Min length | 1 |
| Value | Count | Frequency (%) | |
| Lowercase_Letter | 2 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 2 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 2 | 100.0% |
mths_since_last_major_derog
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
| Value | Count | Frequency (%) | |
| 2 | 3424 | 34.2% | |
| 3 | 3299 | 33.0% | |
| 1 | 3277 | 32.8% |
Length
| Max length | 1 |
|---|---|
| Mean length | 1 |
| Min length | 1 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 3 | 100.0% |
| Value | Count | Frequency (%) | |
| Common | 3 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 3 | 100.0% |
policy_code
Categorical
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
| Value | Count | Frequency (%) | |
| pc3 | 2098 | 21.0% | |
| pc5 | 2025 | 20.2% | |
| pc1 | 1978 | 19.8% | |
| pc2 | 1962 | 19.6% | |
| pc4 | 1937 | 19.4% |
Length
| Max length | 3 |
|---|---|
| Mean length | 3 |
| Min length | 3 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 5 | 71.4% | |
| Lowercase_Letter | 2 | 28.6% |
| Value | Count | Frequency (%) | |
| Common | 5 | 71.4% | |
| Latin | 2 | 28.6% |
| Value | Count | Frequency (%) | |
| ASCII | 7 | 100.0% |
cr_line_yrs
Real number (ℝ≥0)
| Distinct count | 50 |
|---|---|
| Unique (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1997.0153 |
|---|---|
| Minimum | 1970 |
| Maximum | 2069 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 1970 |
|---|---|
| 5-th percentile | 1984 |
| Q1 | 1994 |
| median | 1998 |
| Q3 | 2001 |
| 95-th percentile | 2006 |
| Maximum | 2069 |
| Range | 99 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 7.739099126 |
|---|---|
| Coefficient of variation (CV) | 0.003875332916 |
| Kurtosis | 21.64917142 |
| Mean | 1997.0153 |
| Mean Absolute Deviation (MAD) | 5.24052386 |
| Skewness | 1.789777925 |
| Sum | 19970153 |
| Variance | 59.89365528 |
| Value | Count | Frequency (%) | |
| 2000 | 839 | 8.4% | |
| 1998 | 753 | 7.5% | |
| 1999 | 715 | 7.1% | |
| 2001 | 642 | 6.4% | |
| 1997 | 601 | 6.0% | |
| 1996 | 592 | 5.9% | |
| 1995 | 518 | 5.2% | |
| 1994 | 513 | 5.1% | |
| 2002 | 503 | 5.0% | |
| 2003 | 455 | 4.5% | |
| Other values (40) | 3869 | 38.7% |
| Value | Count | Frequency (%) | |
| 1970 | 14 | 0.1% | |
| 1971 | 11 | 0.1% | |
| 1972 | 13 | 0.1% | |
| 1973 | 18 | 0.2% | |
| 1974 | 14 | 0.1% |
| Value | Count | Frequency (%) | |
| 2069 | 9 | 0.1% | |
| 2068 | 7 | 0.1% | |
| 2067 | 6 | 0.1% | |
| 2066 | 2 | < 0.1% | |
| 2065 | 2 | < 0.1% |
cr_line_mths
Real number (ℝ≥0)
| Distinct count | 12 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.8571 |
|---|---|
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 7 |
| Q3 | 10 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.546200381 |
|---|---|
| Coefficient of variation (CV) | 0.5171574545 |
| Kurtosis | -1.244569145 |
| Mean | 6.8571 |
| Mean Absolute Deviation (MAD) | 3.09104728 |
| Skewness | -0.1786189228 |
| Sum | 68571 |
| Variance | 12.57553714 |
| Value | Count | Frequency (%) | |
| 10 | 1057 | 10.6% | |
| 11 | 999 | 10.0% | |
| 12 | 972 | 9.7% | |
| 9 | 923 | 9.2% | |
| 1 | 904 | 9.0% | |
| 8 | 794 | 7.9% | |
| 7 | 771 | 7.7% | |
| 6 | 740 | 7.4% | |
| 5 | 740 | 7.4% | |
| 2 | 728 | 7.3% | |
| Other values (2) | 1372 | 13.7% |
| Value | Count | Frequency (%) | |
| 1 | 904 | 9.0% | |
| 2 | 728 | 7.3% | |
| 3 | 696 | 7.0% | |
| 4 | 676 | 6.8% | |
| 5 | 740 | 7.4% |
| Value | Count | Frequency (%) | |
| 12 | 972 | 9.7% | |
| 11 | 999 | 10.0% | |
| 10 | 1057 | 10.6% | |
| 9 | 923 | 9.2% | |
| 8 | 794 | 7.9% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that, for a linear relationship, the angle of the line to the x-axis does not affect r. To calculate r for two variables X and Y, divide the covariance of X and Y by the product of their standard deviations.
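The calculation described above can be sketched in plain Python. This is a minimal illustration on toy data rather than the code used to produce the report. The function name `pearson_r` and the sample values are assumptions made for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of deviations are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Toy data (not taken from the report).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

r_xy = pearson_r(x, y)
# Shifting and rescaling y by a positive factor leaves r unchanged.
r_shifted = pearson_r(x, [3.0 * v + 7.0 for v in y])
# Reversing the direction of the relationship flips the sign.
r_negated = pearson_r(x, [-v for v in y])

print(r_xy, r_shifted, r_negated)
```

The second and third calls illustrate the invariance property: a change in location or positive scale leaves r as it is, and only the direction of the relationship changes its sign.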
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, divide the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
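As a rough sketch of that recipe, the code below ranks each variable and then applies the covariance-over-standard-deviations formula to the ranks. The helper names (`average_ranks`, `spearman_rho`) and the toy data are illustrative. Giving tied values the average of their positions is an assumption of this sketch, not something the report states.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson's formula applied to the rank variables of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_rx = sum(rx) / n
    mean_ry = sum(ry) / n
    cov = sum((a - mean_rx) * (b - mean_ry) for a, b in zip(rx, ry))
    var_rx = sum((a - mean_rx) ** 2 for a in rx)
    var_ry = sum((b - mean_ry) ** 2 for b in ry)
    if var_rx == 0 or var_ry == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_rx * var_ry)


# Toy data (not taken from the report): monotonic but clearly non-linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho_monotonic = spearman_rho(x, y)
rho_reversed = spearman_rho(x, [-v for v in y])
ranks_with_ties = average_ranks([10, 20, 20, 30])

print(rho_monotonic, rho_reversed, ranks_with_ties)
```

Because only the ordering of the values matters, the cubic relationship still yields a coefficient of magnitude 1.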
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, count the concordant and discordant pairs of observations: τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
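The pair-counting definition can be written out directly. The sketch below implements exactly the formula stated in the text (often called tau-a) on toy data. The function name `kendall_tau_a` is illustrative. Statistical libraries commonly apply a tie correction, which this sketch does not attempt, so values for data with many ties may differ from a library result.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 is a tie in x or y and counts as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data (not taken from the report): one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)

print(tau)
```

With four observations there are six pairs. Five are concordant and one is discordant, so τ = (5 - 1) / 6.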
First rows
| | is_bad | emp_length | home_ownership | annual_inc | verification_status | pymnt_plan | purpose_cat | zip_code | addr_state | debt_to_income | delinq_2yrs | inq_last_6mths | mths_since_last_delinq | mths_since_last_record | open_acc | pub_rec | revol_bal | revol_util | total_acc | initial_list_status | mths_since_last_major_derog | policy_code | cr_line_yrs | cr_line_mths |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 10 | mortgage | 50000.0 | not verified | n | medical | 766xx | tx | 10.87 | 0.0 | 0.0 | 0.0 | 119 | 15.0 | 0 | 12087 | 12.1 | 44.0 | f | 1 | pc4 | 1992.0 | 12 |
| 1 | 0 | 1 | rent | 39216.0 | not verified | n | debt consolidation | 660xx | ks | 9.15 | 0.0 | 2.0 | 0.0 | 119 | 4.0 | 0 | 10114 | 64.0 | 5.0 | f | 2 | pc1 | 2005.0 | 11 |
| 2 | 0 | 4 | rent | 65000.0 | not verified | n | credit card | 916xx | ca | 11.24 | 0.0 | 0.0 | 0.0 | 119 | 4.0 | 0 | 81 | 0.6 | 8.0 | f | 3 | pc4 | 1970.0 | 6 |
| 3 | 0 | 10 | mortgage | 57500.0 | not verified | n | debt consolidation | 124xx | ny | 6.18 | 1.0 | 0.0 | 16.0 | 119 | 6.0 | 0 | 10030 | 37.1 | 23.0 | f | 2 | pc2 | 1982.0 | 9 |
| 4 | 0 | 10 | mortgage | 50004.0 | verified - income | n | debt consolidation | 439xx | oh | 19.03 | 0.0 | 4.0 | 0.0 | 119 | 8.0 | 0 | 10740 | 40.4 | 21.0 | f | 3 | pc3 | 1999.0 | 10 |
| 5 | 0 | 4 | rent | 47028.0 | verified - income | n | other | 200xx | dc | 7.83 | 2.0 | 1.0 | 19.0 | 119 | 6.0 | 0 | 1715 | 26.4 | 25.0 | f | 3 | pc3 | 1999.0 | 12 |
| 6 | 0 | 10 | mortgage | 126000.0 | not verified | n | credit card | 103xx | ny | 14.28 | 0.0 | 0.0 | 0.0 | 119 | 18.0 | 0 | 5466 | 11.1 | 29.0 | f | 3 | pc1 | 1979.0 | 11 |
| 7 | 0 | 6 | mortgage | 42000.0 | verified - income source | n | debt consolidation | 891xx | nv | 10.29 | 0.0 | 0.0 | 0.0 | 119 | 9.0 | 0 | 10354 | 95.9 | 10.0 | f | 3 | pc3 | 2006.0 | 4 |
| 8 | 0 | 2 | mortgage | 50000.0 | verified - income | n | debt consolidation | 612xx | il | 15.36 | 0.0 | 2.0 | 0.0 | 119 | 11.0 | 0 | 19662 | 59.2 | 27.0 | f | 1 | pc5 | 2001.0 | 2 |
| 9 | 0 | 1 | rent | 40000.0 | not verified | n | car | 926xx | ca | 6.48 | 0.0 | 1.0 | 0.0 | 119 | 11.0 | 0 | 19998 | 18.3 | 23.0 | f | 1 | pc5 | 1995.0 | 5 |
Last rows
| | is_bad | emp_length | home_ownership | annual_inc | verification_status | pymnt_plan | purpose_cat | zip_code | addr_state | debt_to_income | delinq_2yrs | inq_last_6mths | mths_since_last_delinq | mths_since_last_record | open_acc | pub_rec | revol_bal | revol_util | total_acc | initial_list_status | mths_since_last_major_derog | policy_code | cr_line_yrs | cr_line_mths |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9990 | 0 | 10 | mortgage | 120000.0 | verified - income | n | home improvement | 481xx | mi | 14.44 | 1.0 | 0.0 | 4.0 | 119 | 14.0 | 0 | 14716 | 59.8 | 31.0 | f | 2 | pc2 | 1994.0 | 2 |
| 9991 | 0 | 10 | rent | 63000.0 | verified - income source | n | medical | 018xx | ma | 10.08 | 0.0 | 0.0 | 0.0 | 119 | 6.0 | 0 | 60 | 1.1 | 22.0 | f | 3 | pc1 | 1989.0 | 5 |
| 9992 | 0 | 10 | rent | 52000.0 | verified - income | n | debt consolidation | 124xx | ny | 23.70 | 0.0 | 0.0 | 70.0 | 119 | 8.0 | 0 | 15002 | 91.5 | 18.0 | f | 2 | pc5 | 1998.0 | 8 |
| 9993 | 0 | 10 | own | 95892.0 | verified - income | n | home improvement | 110xx | ny | 8.70 | 0.0 | 2.0 | 0.0 | 119 | 3.0 | 0 | 2139 | 30.6 | 7.0 | f | 3 | pc5 | 1995.0 | 7 |
| 9994 | 1 | 1 | rent | 24996.0 | verified - income source | n | debt consolidation | 913xx | ca | 3.79 | 0.0 | 0.0 | 0.0 | 119 | 2.0 | 0 | 4801 | 56.5 | 7.0 | f | 1 | pc1 | 2005.0 | 8 |
| 9995 | 0 | 5 | mortgage | 66250.0 | verified - income | n | wedding | 014xx | ma | 9.40 | 0.0 | 1.0 | 0.0 | 119 | 8.0 | 0 | 3656 | 24.1 | 10.0 | f | 2 | pc3 | 2001.0 | 9 |
| 9996 | 0 | 1 | rent | 26000.0 | verified - income source | n | debt consolidation | 112xx | ny | 20.49 | 0.0 | 1.0 | 79.0 | 119 | 8.0 | 0 | 6709 | 58.9 | 12.0 | f | 2 | pc3 | 2000.0 | 5 |
| 9997 | 0 | 8 | rent | 47831.0 | not verified | n | debt consolidation | 070xx | nj | 24.13 | 0.0 | 0.0 | 0.0 | 111 | 9.0 | 1 | 11346 | 60.7 | 17.0 | f | 3 | pc3 | 1989.0 | 12 |
| 9998 | 0 | 6 | mortgage | 70000.0 | not verified | n | major purchase | 244xx | va | 16.18 | 2.0 | 2.0 | 16.0 | 119 | 9.0 | 0 | 17157 | 50.9 | 27.0 | f | 2 | pc3 | 1999.0 | 3 |
| 9999 | 0 | 1 | rent | 70560.0 | not verified | n | credit card | 900xx | ca | 16.13 | 0.0 | 1.0 | 53.0 | 119 | 15.0 | 0 | 2304 | 22.6 | 34.0 | f | 2 | pc5 | 2000.0 | 9 |